A Semantic-Based Approach for Multilingual Translation of Massive Documents
Abstract
This paper presents the Universal Networking Language (UNL), an interlingua-based framework that facilitates semantic processing of natural languages by computer. UNL is an artificial language that describes the meaning of sentences as semantic networks. The framework represents all sentences that share the same meaning, in any natural language, with a single semantic graph. Once this graph is built, it can be decoded into any other language. The paper focuses on Arabic as both a source and a target language and evaluates the Arabic decoding output at the morphological, syntactic, and semantic levels. The evaluation is based on translating one complete document from English to Arabic through UNL, an intermediate representation from which multilingual translations can be obtained.
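The interlingua idea in the abstract, one semantic graph decoded into several languages, can be illustrated with a minimal sketch. This is not the paper's system: the `Relation` class, the `LEXICONS` and `ORDERS` tables, and the `decode` function are hypothetical names chosen for illustration, the relation labels `agt` and `obj` are only loosely modelled on UNL-style agent and object relations, and the fixed word-order templates stand in for the morphological and syntactic generation a real decoder would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    label: str  # relation name, e.g. "agt" (agent) or "obj" (object)
    head: str   # concept the relation starts from
    tail: str   # concept the relation points to


# Toy language-neutral graph for "the boy reads a book" (illustrative only).
GRAPH = [
    Relation("agt", "read", "boy"),
    Relation("obj", "read", "book"),
]

# Hypothetical per-language lexicons mapping concepts to surface words.
LEXICONS = {
    "en": {"read": "reads", "boy": "the boy", "book": "a book"},
    "ar": {"read": "يقرأ", "boy": "الولد", "book": "كتاباً"},
}

# Hypothetical word-order templates: SVO for English, VSO for Arabic.
ORDERS = {"en": ("agt", "verb", "obj"), "ar": ("verb", "agt", "obj")}


def decode(graph, lang):
    """Render the same semantic graph as a sentence in the given language."""
    lexicon = LEXICONS[lang]
    # Assumption: every relation in this toy graph shares one verbal head.
    slots = {"verb": lexicon[graph[0].head]}
    for rel in graph:
        slots[rel.label] = lexicon[rel.tail]
    return " ".join(slots[slot] for slot in ORDERS[lang])


print(decode(GRAPH, "en"))
print(decode(GRAPH, "ar"))
```

Both calls read the same `GRAPH`; only the lexicon and word order change, which is the property that lets a single encoded document serve several target languages.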
Similar resources
English-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...
Semantic-Based Multilingual Document Clustering via Tensor Modeling
A major challenge in document clustering research arises from the growing amount of text data written in different languages. Previous approaches depend on language-specific solutions (e.g., bilingual dictionaries, sequential machine translation) to evaluate document similarities, and the required transformations may alter the original document semantics. To cope with this issue we propose a ne...
Clustering multilingual documents by estimating text-to-text semantic relatedness
This thesis is about multilingual document clustering through estimating semantic relatedness between multilingual texts. Specifically, we focus on the task of clustering multilingual documents with very limited or no supervisory information. We present two approaches to address the problem: a comparable-corpora based approach and a web-searches based approach. Our first approach derives pairwi...
Development of a Multilingual Information Retrieval and Check System Based on Database Semantic
Technical documents for multilateral agreements or international business transactions are normally produced in a bilingual or multilingual form. Being mostly of legal nature, these documents require especially accurate and speedy translations by expert translators. In order to aid these experts, automatic ways of checking translation results (such as a spelling checker) would be highly desirab...
MT Techniques in a Retrieval System of Semantically Enriched Patents
This paper focuses on how automatic translation techniques integrated in a patent retrieval system increase its capabilities and make possible extended features and functionalities. We describe 1) a novel methodology for natural language to SPARQL translation based on a grammar–ontology interoperability automation and a query grammar for the patents domain; 2) a devised strategy for statistical...